Abstract:

A statistical analysis is performed for a random unrestricted local crew scheduling problem,
expressed in terms of pairing arrivals with departures. The analysis is aimed at understanding the
structure of similar problems with global restrictions, and estimating their difficulty. The methods
developed are of a general nature and can be of use in other problems with a similar structure. For
large random problems, the ground-state energy scales like $\sqrt{N}$ and the average excitation like $N$,
where $N$ is the number of arrivals/departures. The average ground-state degeneracy is such that
the probability of hitting an optimal pairing by chance scales like $2N\,2^{-N}$ for large $N$. By insisting
on the local ground-state energy for a restricted problem, airports can be split into smaller parts,
and the state space reduced by typically a factor $\sim 2^{N_a}$, with $N_a$ the total number of airports.
1 Introduction



Airline crew scheduling represents an important class of optimization problems where the topological 
structure is important. In ref. [1], a novel Potts artificial neural network approach is developed
for attacking semi-realistic airline crew scheduling problems of the following type: A weekly flight 
schedule is given in terms of a set of flights, each with a specified airport and time for departure and 
arrival. The object is to assign a crew to each flight, while minimizing a cost-function defined by the 
total required crew time (including the waiting-time at airports). The solution is subject to a set 
of global constraints: The crews are required to travel along closed tours, starting and ending at a 
certain airport, the home-base. These tours are subject to limitations as to duration and leg-count. 

As shown in ref. [1], a great deal of simplification is gained by reformulating the problem as
that of mapping arrivals on departures at each airport, implying an implicit representation of the
crews. In fact, without the global restrictions, the problem is reduced to a set of independent local
subproblems, one at each airport. Each local problem amounts to minimizing the local waiting-time,
and is easily solved in polynomial time.

In this paper, we focus on unrestricted local problems of this kind, in particular their statistical
properties. These are quite interesting, and in no way trivial, in spite of the triviality of the problems.
In particular, we consider the ensemble of random local problems of a fixed size N, as defined by 
the number of arrivals/departures. In addition, we analyze the properties of random solutions to 
such problems. 

Such a statistical analysis of this type of problem does not exist in the literature, and we feel it 
is interesting for the following reasons: The results illuminate the structure of the corresponding 
restricted problems, and as a by-product, useful tools are provided for probing their difficulty, and 
for simplifying their solution. Some of the methods used are novel, and may be used also in other 
contexts, where a similar structure occurs. In addition, a lower bound to the waiting-time is provided
by the solutions to the unrestricted problem. This bound is often saturated [1].

The methodology we use contains the following steps. First, the analysis of a local problem is 
simplified by considering its topology (defined by the relative ordering in time between arrivals 
and departures) separately from its geometry (defined by the lengths of the time intervals between 
consecutive events). The ensemble of problems is thus factorized into the direct product of the 
ensemble of topologies and the ensemble of geometries. 

After introducing a notation for the topology, we consider problems with a fixed topology, and 
evaluate averages over the geometry of entities related to the waiting-time. These are simple, since 
the effect of the geometry on the waiting-time spectrum is a mere shift. 

The apparent difficulty of a problem is probed by analyzing the distribution of waiting-times of 
random solutions. A nice feature is that the waiting-time spectrum for each problem is quantized 
in steps of the basic period of the schedule. 

Subsequently, all variables of interest are averaged also over the topology. This is a more difficult 
task, and requires the use of a subtle recursive method. 

An alternative measure of the difficulty of a problem is the ground-state degeneracy, i.e. the number
of solutions with minimal waiting-time, as compared to the total number of solutions, given by $N!$.
This is independent of geometry. Due to the character of the dependence on topology, non-standard
methods are required to compute the average over topology.

This paper is organized as follows: In Section 2, the problem ensemble is defined. In Section 3, a 
formalism is introduced for characterizing the topology. The degeneracy structure as a function of 
topology is analyzed, and various energy moments for fixed topology are computed, by averaging 
over geometry and/or pairing. In Section 4, a statistical analysis of the detailed degeneracy structure 
is considered. In particular, the average ground-state degeneracy is computed. Section 5 contains 
our conclusions. 



2 Unrestricted Crew Scheduling 
2.1 The Local Problem 

A local problem of size N is defined by specifying the times for N arrivals (arr's) and N departures
(dep's), denoted respectively by $t^a_i$ and $t^d_i$, $i = 1, \ldots, N$. The object is simply to find a one-to-one
mapping (a pairing) between the arr's and dep's, such that the energy (or objective function) E,
given by the total waiting-time, is minimal. In general this can be done in more than one way,
implying a degeneracy of the ground-state.

The pairing of an arr A with a dep D implies that the crew assigned to A should next be assigned
to D. The periodicity of the schedule implies that any arr may be mapped on any dep: if the dep
is earlier, it is taken as the same dep in the next period.

In what follows, we will use the period as the unit of time (and thus energy). Then the times for
the arr's and dep's can be restricted to the unit interval. If an arr i is mapped on a dep j, their
contribution to the total waiting-time is given by

$$t_{ij} = (t^d_j - t^a_i) \bmod 1 \in [0,1]. \qquad (1)$$

Thus, whatever the mapping, the energy E is restricted to the interval [0, N], and between different
pairings it can only change by an integer amount. Pairings yielding the lowest possible energy $E_0$
are said to belong to the ground-state, while a pairing with $E = E_0 + k$, $k > 0$, is said to belong to
the k:th excited state. In figure 1 an example of a flight schedule is depicted.

Figure 1: An example of a local flight schedule.
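The quantization of the spectrum can be illustrated numerically. The following is a minimal sketch, assuming a made-up schedule with N = 3; the function names `waiting_time` and `energy` and the numerical times are illustrative choices, not taken from the paper. It evaluates eq. (1) for every pairing and collects the integer excitations above the minimum.

```python
from itertools import permutations

def waiting_time(t_arr, t_dep):
    """Waiting time on the unit period, eq. (1): an earlier dep is taken
    as the same dep in the next period."""
    return (t_dep - t_arr) % 1.0

def energy(arr_times, dep_times, pairing):
    """Total waiting time; pairing[i] is the index of the dep assigned to arr i."""
    return sum(waiting_time(arr_times[i], dep_times[j])
               for i, j in enumerate(pairing))

# Hypothetical schedule (times in units of the period), topology ADADAD.
arr_times = [0.10, 0.35, 0.60]
dep_times = [0.20, 0.50, 0.90]

energies = [energy(arr_times, dep_times, p) for p in permutations(range(3))]
e0 = min(energies)
# Energies differ from the ground-state energy by (numerically) integer amounts.
excitations = sorted(round(e - e0) for e in energies)
```

For this particular schedule one pairing is optimal, four lie one period above it, and one lies two periods above it.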
2.2 The Random Local Problem



A random local problem is defined by independently choosing the times for N arr's and N dep's
randomly on the unit period. It is useful to divide the characterization of a given problem in
two parts: its topology and its geometry. The topology is defined by the relative ordering in
the combined (cyclic) sequence of arr's and dep's. The probability is the same for every distinct
topology. The geometry is defined by the sizes of the 2N inter-spaces $x_i$, $i = 1, \ldots, 2N$, into which
the period is divided.

For a given problem, the solution space is defined by the set of pairings, i.e. the $N!$ possible
mappings between arr's and dep's.

We will be interested mainly in the statistical properties of the following entities: the ground-state
energy (minimal waiting-time) $E_0$, the energy $E$ of a random pairing, and the (integer) difference
$D = E - E_0$ defining the excitation energy. We will also be interested in the degeneracies of the
ground-state and the excited states. To that end, we will consider three kinds of averages: over
respectively the topology, the geometry, and the pairing.

The degeneracies of the ground-state and the excited states depend only on the topology, i.e. the
combined ordering of arr's and dep's.

For a fixed topology, the excitation energy D is independent of the geometry, and depends entirely
on the pairing. Conversely, the ground-state energy $E_0$ obviously is independent of the pairing, and
depends only on the geometry. Thus, for a fixed topology, D and $E_0$ are completely uncorrelated,
in the combined ensemble of random geometries and random pairings.



3 Analysis for Fixed Topology 
3.1 Characterization of the Topology 

A simple way to achieve a ground-state pairing (i.e. solve the problem) for a given topology is as 
follows: 

1. An arr immediately followed by a dep is paired with that dep, and both are removed from the 
sequence. Note that the dep could be in the next period. 

2. The process is continued until all arr's and dep's are used. 

As an example, consider the sequence $[A_1 A_2 A_3 D_1 D_2 A_4 D_3 D_4]$, corresponding to the topology of
the schedule in figure 1.

• Pairing $A_3$ with $D_1$ leaves $[A_1 A_2 D_2 A_4 D_3 D_4]$.

• Pairing $A_2$ with $D_2$ leaves $[A_1 A_4 D_3 D_4]$.

• Pairing $A_4$ with $D_3$ leaves $[A_1 D_4]$.

• Pairing $A_1$ with $D_4$ finishes the process.
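The reduction above can be sketched in a few lines of code. This is an illustrative implementation, assuming the topology is given as a string of 'A' and 'D' characters; the function name `ground_state_pairing` is a hypothetical choice. It literally repeats step 1, treating the sequence as cyclic.

```python
def ground_state_pairing(sequence):
    """Repeatedly pair an 'A' that is immediately followed (cyclically) by a 'D'
    and remove both. Returns (arr_position, dep_position) pairs, with positions
    referring to the input string, in the order the pairs were formed."""
    events = list(enumerate(sequence))
    pairs = []
    while events:
        n = len(events)
        for i in range(n):
            j = (i + 1) % n  # cyclic successor: the dep may be in the next period
            if events[i][1] == "A" and events[j][1] == "D":
                pairs.append((events[i][0], events[j][0]))
                # Delete the larger list index first so the smaller stays valid.
                for idx in sorted((i, j), reverse=True):
                    del events[idx]
                break
        else:
            raise ValueError("sequence needs equally many A's and D's")
    return pairs

example_pairs = ground_state_pairing("AAADDADD")
```

For the example sequence this reproduces the four steps listed above: positions (2, 3), (1, 4), (5, 6) and (0, 7), counting from zero.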



A graphical representation of the topology is now defined as follows. 

• If necessary, rotate the sequence such that no pairing crosses the interval border. 

• Represent the arr's and dep's by equally spaced points on the interval. 

• For each pairing in turn, draw a line from the arr to the dep on the lowest level (one). All 
previously drawn lines that overlap with the new line are lifted one level. 

Thus a set of lines at different levels are obtained, each line starting at an arr and ending on a dep. 
For the example above, the result is shown in fig. 2. Each line represents a crew waiting for a dep.



Figure 2: Graphical representation of the topology [AAADDADD] 

It turns out that for the properties of the total energy spectrum, full knowledge of the topology is
not necessary; it suffices to know how many lines there are at the different levels. Thus let $P_k$ be
the number of lines at level k (k = 1, 2, 3, ...). For the example above, the P sequence is [1, 2, 1],
i.e. $P_1 = 1$, $P_2 = 2$, $P_3 = 1$, while $P_k = 0$ for $k > 3$.
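As a sketch of how the P sequence can be extracted in practice: after rotating the sequence so that no line crosses the interval border, the level of a line equals the number of lines alive just after its arr, i.e. its nesting depth. The code below assumes the topology is given as a string of 'A'/'D' characters; the name `p_sequence` and the rotation rule (start right after the first minimum of the running A-minus-D balance) are illustrative choices.

```python
from collections import Counter

def p_sequence(sequence):
    """Return [P_1, P_2, ...] for a cyclic topology string of 'A' and 'D'."""
    # Rotate so that the running balance never goes negative, which means
    # that no ground-state line crosses the interval border.
    balance, lowest, start = 0, 0, 0
    for pos, kind in enumerate(sequence):
        balance += 1 if kind == "A" else -1
        if balance < lowest:
            lowest, start = balance, pos + 1
    rotated = sequence[start:] + sequence[:start]

    # The level of a line is the number of lines alive right after its arr.
    depth, counts = 0, Counter()
    for kind in rotated:
        if kind == "A":
            depth += 1
            counts[depth] += 1
        else:
            depth -= 1
    return [counts[k] for k in range(1, max(counts, default=0) + 1)]
```

Applied to the example, `p_sequence("AAADDADD")` gives `[1, 2, 1]`.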



3.2 Degeneracy Structure 

The ground-state degeneracy is now given by the product of the line-levels, i.e.

$$g_0 = \prod_{k \ge 1} k^{P_k}. \qquad (2)$$

This is because each dep must terminate a line, and the number of available crews is equal to the
number of lines alive at that point, which is given by the line's level. For the example we get
$3 \times 2 \times 2 \times 1 = 12$, corresponding to half of the 24 possible pairings.
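The count can be checked by brute force. The sketch below is only an illustration: it places the 2N events at equally spaced times (an arbitrary geometry, chosen here because the degeneracies depend only on the topology), enumerates all N! pairings, and counts how many fall on each energy level. The function name `brute_force_spectrum` is hypothetical.

```python
from itertools import permutations

def brute_force_spectrum(sequence):
    """Number of pairings at each excitation level for a topology string of
    'A'/'D', using equally spaced event times as an arbitrary geometry."""
    size = len(sequence)
    arr = [i / size for i, kind in enumerate(sequence) if kind == "A"]
    dep = [i / size for i, kind in enumerate(sequence) if kind == "D"]
    energies = [sum((dep[j] - arr[i]) % 1.0 for i, j in enumerate(p))
                for p in permutations(range(len(arr)))]
    e0 = min(energies)
    counts = {}
    for e in energies:
        level = round(e - e0)  # excitations are integers in units of the period
        counts[level] = counts.get(level, 0) + 1
    return [counts[k] for k in sorted(counts)]

spectrum = brute_force_spectrum("AAADDADD")
```

For the example topology, 12 of the 24 pairings are ground-states and the remaining 12 lie one period higher.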

Naively, the degeneracy of the m:th excited state can be obtained by adding m dummy lines, covering
the entire interval; they represent extra crews. Then each dep has m additional crews to choose
between, and we get for the naive degeneracy

$$a_m = \prod_{k \ge 1} (k+m)^{P_k}, \qquad (3)$$

where we have assumed indistinguishable dummy lines; otherwise we would have an extra factor
$m!$. This defines an infinite sequence.

[Footnote 4] For the example above, this precisely corresponds to the spectrum $O|n\rangle = a_n|n\rangle$ of the
quantum-mechanical operator $O = a\,a\,a\,a^\dagger a^\dagger a\,a^\dagger a^\dagger$, where $a^\dagger$, $a$ are harmonic oscillator creation
and annihilation operators, satisfying $[a, a^\dagger] = 1$, and $|n\rangle \propto (a^\dagger)^n|0\rangle$ is the n:th excited state.



However, some of the possible pairings will contain permanently grounded crews (closed lines), or
lines extending over more than one period. The contribution to the naive multiplicity from solutions
with n grounded crews and/or excessive periods depends on the proper degeneracy n steps down.
Denoting the proper degeneracies by $g_m$, the relation is

$$a_m = \sum_{n=0}^{m} g_{m-n} \binom{N+n}{n}, \qquad (4)$$

where the last factor is a binomial coefficient. This represents a kind of renormalization, and can
be inverted to yield the proper degeneracies

$$g_m = \sum_{n=0}^{m} a_{m-n}\, (-1)^n \binom{N+1}{n}. \qquad (5)$$

This must define a finite sequence, since $g_m \ge 0$ and the total number of pairings, $\sum_m g_m = N!$, is
finite.

From eq. (3), it is obvious that the naive degeneracy $a_m$ is an N:th degree polynomial in m; hence
it can be written as

$$a_m = \sum_{k=0}^{N} c_k \binom{m+k}{k}, \qquad (6)$$

with some coefficients $c_k$. Define the generating functions

$$A(x) = \sum_m a_m x^m, \qquad G(x) = \sum_m g_m x^m. \qquad (7)$$

Due to eq. (5), these are related by $G(x) = (1-x)^{N+1} A(x)$. Then, in terms of $c_k$, we have

$$A(x) = \sum_{k=0}^{N} c_k (1-x)^{-k-1}, \qquad (8)$$

$$G(x) = \sum_{k=0}^{N} c_k (1-x)^{N-k}. \qquad (9)$$

From this, we see that the g sequence is indeed finite: $g_m = 0$ for all $m > N$. The individual
degeneracies $g_m$ can be obtained from G and its derivatives at x = 0,

$$g_0 = G(0) = \sum_k c_k, \qquad (10)$$

$$g_1 = G'(0) = -\sum_{k=0}^{N} (N-k)\, c_k,$$

etc.
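As a worked illustration of eqs. (3) and (5), the sketch below computes the naive degeneracies $a_m$ from a P sequence and inverts them to the proper degeneracies $g_m$. The function names are hypothetical; the P sequence is assumed to be given as a list $[P_1, P_2, \ldots]$.

```python
from math import comb, prod

def naive_degeneracy(p_seq, m):
    """a_m of eq. (3): product over levels k of (k + m)**P_k."""
    return prod((k + m) ** p_k for k, p_k in enumerate(p_seq, start=1))

def proper_degeneracies(p_seq):
    """g_0, g_1, ... from eq. (5), with trailing zeros removed."""
    n = sum(p_seq)
    g = [sum((-1) ** j * comb(n + 1, j) * naive_degeneracy(p_seq, m - j)
             for j in range(m + 1))
         for m in range(n + 1)]
    while g and g[-1] == 0:
        g.pop()
    return g
```

For the example topology, `proper_degeneracies([1, 2, 1])` returns `[12, 12]`, in agreement with the 12 ground-state pairings found above, and the entries always sum to $N!$.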
3.3 The Excitation Energy for a Random Pairing

Conversely, the moments over $g_m$ are readily obtained from G and its derivatives at x = 1,

$$\sum_m g_m = G(1) = c_N, \qquad (11)$$

$$\sum_m m\, g_m = G'(1) = -c_{N-1},$$

$$\sum_m m(m-1)\, g_m = G''(1) = 2 c_{N-2},$$

etc. In order to relate $c_k$ to $P_k$, we can express the coefficients of the polynomial $a_m$ in two different
ways. From eq. (3) we get for the leading coefficients

$$a_m = m^N + m^{N-1} \sum_k k P_k + \frac{1}{2}\, m^{N-2} \left[ \Big(\sum_k k P_k\Big)^2 - \sum_k k^2 P_k \right] + \ldots, \qquad (12)$$

while eq. (6) gives, upon expanding the binomial coefficients,

$$a_m = \frac{m^N}{N!}\, c_N + \frac{m^{N-1}}{N!} \left\{ \frac{1}{2} N(N+1)\, c_N + N c_{N-1} \right\} \qquad (13)$$

$$\qquad + \frac{m^{N-2}}{N!} \left\{ \frac{1}{24} N(N-1)(N+1)(3N+2)\, c_N + \frac{1}{2} N^2 (N-1)\, c_{N-1} + N(N-1)\, c_{N-2} \right\} + \ldots$$

From this we obtain the relations

$$c_N = N!, \qquad (14)$$

$$c_{N-1} = N! \left\{ \frac{1}{N} \sum_k k P_k - \frac{N+1}{2} \right\},$$

$$c_{N-2} = N! \left\{ \frac{1}{2N(N-1)} \left[ \Big(\sum_k k P_k\Big)^2 - \sum_k k^2 P_k \right] - \frac{1}{2} \sum_k k P_k + \frac{(N+1)(3N-2)}{24} \right\},$$

etc. This gives (as it should) $\sum_m g_m = N!$, the number of possible pairings. In a given topology,
the fraction of pairings having an energy D = m steps above $E_0$ is given by $g_m/N!$. Thus, the first
few moments of the excitation energy D for a random pairing are:

$$\langle D \rangle_p = \frac{1}{N!} \sum_m m\, g_m = \frac{N+1}{2} - \frac{1}{N} \sum_k k P_k,$$

$$\langle D^2 \rangle_p = \frac{1}{N!} \sum_m m^2 g_m \qquad (15)$$

$$\qquad = \frac{1}{N(N-1)} \Big(\sum_k k P_k\Big)^2 - \frac{1}{N(N-1)} \sum_k k^2 P_k - \frac{N+1}{N} \sum_k k P_k + \frac{(N+1)(3N+4)}{12},$$

where $\langle\,\rangle_p$ denotes the average over pairings for a fixed topology.
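The closed forms for the pairing averages can be evaluated exactly with rational arithmetic. The sketch below is illustrative only: the function name `excitation_moments` is hypothetical, the P sequence is assumed to be a list $[P_1, P_2, \ldots]$, and $N \ge 2$ is assumed because of the factor $N(N-1)$ in the second moment.

```python
from fractions import Fraction

def excitation_moments(p_seq):
    """Return (<D>_p, <D^2>_p) for a fixed topology, following eqs. (15).
    Assumes N = sum(p_seq) >= 2."""
    n = sum(p_seq)
    s1 = sum(k * p for k, p in enumerate(p_seq, start=1))      # sum_k k P_k
    s2 = sum(k * k * p for k, p in enumerate(p_seq, start=1))  # sum_k k^2 P_k
    mean = Fraction(n + 1, 2) - Fraction(s1, n)
    second = (Fraction(s1 * s1 - s2, n * (n - 1))
              - Fraction((n + 1) * s1, n)
              + Fraction((n + 1) * (3 * n + 4), 12))
    return mean, second
```

For the topology ADADADAD, with P = [4] and degeneracies (1, 11, 11, 1), this gives $\langle D \rangle_p = 3/2$ and $\langle D^2 \rangle_p = 8/3$, which matches a direct average over the 24 pairings.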
3.4 The Ground-State Energy in a Random Geometry

For a given topology, the ground-state energy depends on the geometry, and is given by

$$E_0 = \sum_{i=1}^{2N} k_i x_i, \qquad (16)$$

where $x_i$ is the length of the i:th sub-interval, and $k_i$ the number of lines in that interval.

To perform averages over a random geometry, we need to analyze the distribution of the intervals
$x_i$. Independently of the topology, they obey the distribution

$$dP = (2N-1)!\; \delta\Big(\sum_i x_i - 1\Big) \prod_i \big(\theta(x_i)\, dx_i\big). \qquad (17)$$

For a single $x_i$, this implies the distribution

$$f(x_i) = (2N-1)\,(1-x_i)^{2N-2}, \qquad 0 < x_i < 1. \qquad (18)$$

Using the identity $\sum_i x_i = 1$, and the permutation symmetry between different $x_i$, we have

$$\langle x_i \rangle = \frac{1}{2N}, \qquad (19)$$

$$\langle x_i x_j \rangle = \frac{1 + \delta_{ij}}{2N(2N+1)}. \qquad (20)$$

In what follows, we will also need the number of intervals with k lines, to be denoted by $Q_k$; in
terms of $P_k$ (defining $P_0 = 0$) it is simply

$$Q_k = P_k + P_{k+1}, \qquad k \ge 0, \qquad (21)$$

since adding a line on level k implies adding two intervals, with respectively k and k - 1 lines.
Obviously, we have $\sum_k Q_k = 2N$.

Now we are ready to compute the first few moments of $E_0$, yielding

$$\langle E_0 \rangle_g = \sum_{i=1}^{2N} k_i \langle x_i \rangle = \frac{1}{2N} \sum_i k_i = \frac{1}{2N} \sum_k k\, Q_k = \frac{1}{N} \sum_k k P_k - \frac{1}{2},$$

$$\langle E_0^2 \rangle_g = \sum_{i,j} k_i k_j \langle x_i x_j \rangle = \frac{1}{2N(2N+1)} \left( \sum_{k,l} k\, l\, Q_k Q_l + \sum_k k^2 Q_k \right) \qquad (22)$$

$$\qquad = \frac{1}{2N(2N+1)} \left( 4 \Big(\sum_k k P_k\Big)^2 + 2 \sum_k k^2 P_k - 2(2N+1) \sum_k k P_k + N(N+1) \right),$$

where $\langle\,\rangle_g$ denotes average over the geometry for fixed topology.
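The relation (21) and the first moment in eqs. (22) are easy to evaluate exactly. The following sketch assumes a P sequence given as a list $[P_1, P_2, \ldots]$; the function names `q_sequence` and `mean_ground_state_energy` are hypothetical.

```python
from fractions import Fraction

def q_sequence(p_seq):
    """Q_k = P_k + P_{k+1} for k = 0, 1, ..., with P_0 = 0 (eq. (21))."""
    padded = [0] + list(p_seq) + [0]
    return [padded[k] + padded[k + 1] for k in range(len(p_seq) + 1)]

def mean_ground_state_energy(p_seq):
    """<E_0>_g = (1 / 2N) * sum_k k Q_k, the first line of eqs. (22)."""
    n = sum(p_seq)
    q = q_sequence(p_seq)
    return Fraction(sum(k * q_k for k, q_k in enumerate(q)), 2 * n)
```

For the example topology, `q_sequence([1, 2, 1])` gives `[1, 3, 3, 1]`, which sums to 2N = 8, and the geometry-averaged ground-state energy is 3/2 in units of the period.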
3.5 The Total Energy

The total energy E is given by the ground-state and excitation energies, $E = E_0 + D$. Averaging
over both geometry and pairing, we obtain the surprisingly simple result

$$\langle E \rangle_{gp} = \langle E_0 \rangle_g + \langle D \rangle_p = \frac{N}{2}, \qquad (23)$$

independently of topology.

Since we already have the second moments of $E_0$ and D, we only need $\langle E_0 D \rangle$ to compute the
second moment of E. Using the fact that $E_0$ and D are statistically independent for a fixed topology,
we get

$$\langle E_0 D \rangle_{gp} = \langle E_0 \rangle_g \langle D \rangle_p = -\frac{1}{N^2} \Big(\sum_k k P_k\Big)^2 + \frac{N+2}{2N} \sum_k k P_k - \frac{N+1}{4}. \qquad (24)$$

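This implies, upon adding $\langle E_0^2 \rangle_g$, $2\langle E_0 D \rangle_{gp}$ and $\langle D^2 \rangle_p$ (the terms linear in $\sum_k k P_k$ cancel),

$$\langle E^2 \rangle_{gp} = \frac{N+2}{N^2 (N-1)(2N+1)} \Big(\sum_k k P_k\Big)^2 - \frac{N+2}{N(N-1)(2N+1)} \sum_k k^2 P_k + \frac{(N+1)(6N^2 - N + 4)}{12\,(2N+1)}. \qquad (25)$$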



4 Topology Statistics 
4.1 The Topology Ensemble 

We now want to perform averages also over the topology, for the moments of the ground-state energy
$E_0$, the excitation energy D, and the total energy E. In table 1 a complete list of topologies for N
up to four is given, along with various characteristics. Note the similarity in symmetry-properties
between the P, Q and g sequences.

To compute $\langle P_k \rangle$, we must analyze the number of ways m[P] to obtain a given P sequence.

• The lowest level lines define $P_1$ non-overlapping subgroups; these are cyclically indistinguishable,
so the naive multiplicity must be divided by $P_1$.

• For the $P_k$ indistinguishable lines on a higher level k, there are $P_{k-1}$ possible lines on the
previous level to put them above. This can be done in $\binom{P_k + P_{k-1} - 1}{P_k}$ ways.

• In addition, there are 2N cyclic rotations of the AD sequence that yield the same topology.

• The P sequence must sum up to N, and $P_1$ must not vanish.

This gives for the multiplicity m[P]

$$m[P] = \frac{2N}{P_1}\, \delta_{N, \sum_k P_k} \prod_{k=1}^{\infty} \binom{P_k + P_{k+1} - 1}{P_{k+1}}. \qquad (26)$$

The total number of possible topologies for a fixed N is simply the number of possible AD sequences,
i.e. the number of distinct orderings of a sequence of N A's and N D's. This is given by $\binom{2N}{N}$.

This should match the total multiplicity $M_N = \sum_{[P]} m[P]$, which can be recursively obtained
from m[P] by expanding the Kronecker delta in terms of a complex integral along a small contour
C around the origin:

$$\delta_{N, \sum_k P_k} = \oint_C \frac{dz}{2\pi i z}\, z^{-N} \prod_k z^{P_k}. \qquad (27)$$

By iteratively using the identity

$$\sum_{b=0}^{\infty} \binom{a+b-1}{b} x^b = \sum_{b=0}^{\infty} \binom{-a}{b} (-x)^b = \left( \frac{1}{1-x} \right)^a \qquad (28)$$

when summing $m[P] \prod_k z^{P_k}$ over the $P_k$ in order of decreasing k, we obtain convergence of the power
factor to $\omega^{P_k}$, with $\omega$ given by

$$\omega = \frac{z}{1-\omega} \;\Rightarrow\; \omega = \frac{1}{2} \left( 1 - \sqrt{1-4z} \right). \qquad (29)$$



N  Pattern    Prob.  {P_k}      {Q_k}        {a_i}            {g_i}        ⟨E_0⟩  ⟨D⟩
-------------------------------------------------------------------------------------
1  AD         1      [1]        [1,1]        1,2,3,...        (1)          1/2    0
2  AADD       2/3    [1,1]      [1,2,1]      2,6,12,...       (2)          2/2    0
2  ADAD       1/3    [2]        [2,2]        1,4,9,...        (1,1)        1/2    1/2
3  AAADDD     3/10   [1,1,1]    [1,2,2,1]    6,24,60,...      (6)          9/6    0
3  AADADD     3/10   [1,2]      [1,3,2]      4,18,48,...      (4,2)        7/6    1/3
3  AADDAD     3/10   [2,1]      [2,3,1]      2,12,36,...      (2,4)        5/6    2/3
3  ADADAD     1/10   [3]        [3,3]        1,8,27,...       (1,4,1)      3/6    3/3
4  AAAADDDD   4/35   [1,1,1,1]  [1,2,2,2,1]  24,120,360,...   (24)         8/4    0
4  AAADADDD   4/35   [1,1,2]    [1,2,3,2]    18,96,300,...    (18,6)       7/4    1/4
4  AAADDADD   4/35   [1,2,1]    [1,3,3,1]    12,72,240,...    (12,12)      6/4    2/4
4  AADAADDD   4/35   [1,2,1]    [1,3,3,1]    12,72,240,...    (12,12)      6/4    2/4
4  AAADDDAD   4/35   [2,1,1]    [2,3,2,1]    6,48,180,...     (6,18)       5/4    3/4
4  AADADADD   4/35   [1,3]      [1,4,3]      8,54,192,...     (8,14,2)     5/4    3/4
4  AADDAADD   2/35   [2,2]      [2,4,2]      4,36,144,...     (4,16,4)     4/4    4/4
4  AADADDAD   4/35   [2,2]      [2,4,2]      4,36,144,...     (4,16,4)     4/4    4/4
4  AADDADAD   4/35   [3,1]      [3,4,1]      2,24,108,...     (2,14,8)     3/4    5/4
4  ADADADAD   1/35   [4]        [4,4]        1,16,81,256,...  (1,11,11,1)  2/4    6/4

Table 1: Inequivalent topologies for $N \le 4$. For each topology, the third column gives the probability
of occurrence, while the next two give its $P_k$ and $Q_k$ sequence, respectively. In the following
two columns the $a_i$ and $g_i$ sequences are given, whereas the last two columns give the average
ground-state energy and excitation, respectively. Note that different topologies might have identical
characteristics in terms of $P_k$, etc.
Thus, the total multiplicity is given by

$$M_N = 2N \oint_C \frac{dz}{2\pi i z}\, z^{-N} \sum_{P_1 \ge 1} \frac{\omega(z)^{P_1}}{P_1} = 2N \oint_C \frac{dz}{2\pi i z}\, z^{-N} \left\{ -\log\big(1 - \omega(z)\big) \right\} = \binom{2N}{N}, \qquad (30)$$

obtained by extracting the $z^N$ coefficient of $-\log(1 - \omega(z))$ as $\frac{1}{2N}\binom{2N}{N}$, by doing the integral
in terms of $\omega$ using $z = \omega(1 - \omega)$.
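The agreement between the summed multiplicities and $\binom{2N}{N}$ can be checked directly for small N. The sketch below is an illustration under the assumption that the allowed P sequences are the ordered sequences of positive integers summing to N (a vanishing $P_k$ below a non-vanishing $P_{k+1}$ has zero multiplicity in eq. (26)); the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def compositions(n):
    """Yield all ordered sequences of positive integers summing to n."""
    if n == 0:
        yield []
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield [first] + rest

def multiplicity(p_seq):
    """m[P] of eq. (26) for a P sequence with positive entries."""
    n = sum(p_seq)
    ways = 1
    for lower, upper in zip(p_seq, p_seq[1:]):
        ways *= comb(lower + upper - 1, upper)
    return Fraction(2 * n * ways, p_seq[0])

def topology_probabilities(n):
    """Probability of each P sequence in the ensemble of random topologies."""
    total = comb(2 * n, n)
    return {tuple(p): multiplicity(p) / total for p in compositions(n)}
```

For N = 4 the multiplicities add up to 70, and the P sequence [1, 2, 1] has probability 8/35, the sum of the two 4/35 entries listed for it in table 1.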



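4.2 Distribution of $P_1$

The number of non-overlapping groups of lines is given by the number of lines at level 1, i.e. $P_1$. In
fact, $\omega(z)$ can be seen as a generating function,

$$\omega(z) = \frac{1}{2} \left( 1 - \sqrt{1-4z} \right) = \sum_{n=1}^{\infty} \frac{1}{n} \binom{2n-2}{n-1} z^n, \qquad (31)$$

for the number of ways $\frac{1}{n}\binom{2n-2}{n-1}$ to arrange n lines in such a group. The distribution of $P_1$ is easy
to compute by slightly modifying eq. (30): skipping the summation over $P_1$ and dividing by $M_N$
yields

$$\mathrm{Prob}(P_1) = \frac{2N}{\binom{2N}{N}} \oint_C \frac{dz}{2\pi i z}\, z^{-N}\, \frac{\omega^{P_1}}{P_1} = \frac{N!\,(2N - P_1 - 1)!}{(2N-1)!\,(N - P_1)!}, \qquad (32)$$

for $P_1 > 0$. For large $N \gg P_1$, this is close to $2^{-P_1}$.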
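A short numerical look at the closed form in eq. (32) may be helpful. The sketch below, with the hypothetical function name `prob_p1`, evaluates it exactly, so that the normalization and the approach to $2^{-P_1}$ can be inspected.

```python
from fractions import Fraction
from math import factorial

def prob_p1(n, p1):
    """Probability of having p1 lines at level 1 in a random topology of
    size n, from the closed form in eq. (32); valid for 1 <= p1 <= n."""
    return Fraction(factorial(n) * factorial(2 * n - p1 - 1),
                    factorial(2 * n - 1) * factorial(n - p1))

# The probabilities sum to one, and approach 2**(-p1) when n is much larger than p1.
total_for_n5 = sum(prob_p1(5, p1) for p1 in range(1, 6))
large_n_value = float(prob_p1(50, 3))
```

For N = 2 this reproduces the probabilities 2/3 and 1/3 of table 1, and for N = 50, $P_1 = 3$ the value is already within a fraction of a percent of 1/8.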

The grouping of lines corresponds to a grouping of the flights, with a matching number of arr's 
and dep's in each group. In a ground-state pairing, flights in different groups are never paired, 
and within a group, the arr has to precede its paired dep. This fact can be used to simplify also 
restricted problems. 



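4.3 Moments of $P_k$

Similarly, $\langle P_k \rangle$ can be obtained from realizing that inserting a factor $P_k$ in the sum over $P_k$ gives
a factor $P_{k-1}\,\omega/(1-\omega)$. This yields in the end an extra factor $P_1 \big(\omega/(1-\omega)\big)^{k-1}$ in the $P_1$ sum:

$$M_N \langle P_k \rangle = 2N \oint_C \frac{dz}{2\pi i z}\, z^{-N} \left( \frac{\omega}{1-\omega} \right)^{k-1} \sum_{P_1 \ge 1} \omega^{P_1} \qquad (33)$$

$$\qquad = 2N \oint_C \frac{dz}{2\pi i z}\, z^{-N} \left( \frac{\omega}{1-\omega} \right)^{k} \qquad (34)$$

$$\qquad = 2N \oint_C \frac{d\omega\, (1 - 2\omega)}{2\pi i\; \omega^{N-k+1} (1-\omega)^{N+k+1}}, \qquad (35)$$

where the last expression is a reformulation in terms of a loop integral around $\omega = 0$. In a similar
way $\langle P_k P_m \rangle$ etc. can be computed. We obtain

$$\langle P_k \rangle = \frac{2N}{M_N} \left\{ \binom{2N-1}{N-k} - \binom{2N-1}{N-k-1} \right\} = \frac{2k}{M_N} \binom{2N}{N-k}, \qquad (36)$$

$$\langle P_k P_m \rangle = \frac{2N}{M_N} \left\{ \binom{2N}{N-m} - \binom{2N}{N-m-k} \right\}, \qquad k \le m. \qquad (37)$$

From these expressions, we can derive the following particular averages, needed to compute the
various energy moments over the topology:

$$\Big\langle \sum_k k P_k \Big\rangle = \frac{N}{2}\, 4^N \binom{2N}{N}^{-1} \approx \frac{N}{2} \sqrt{\pi N},$$

$$\Big\langle \sum_k k^2 P_k \Big\rangle = N^2, \qquad (38)$$

$$\Big\langle \Big(\sum_k k P_k\Big)^2 \Big\rangle = \frac{N^2 (5N+1)}{6},$$

where the approximate form in the top equation is valid for large N.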



4.4 Full Energy Averages 

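We are now ready to compute the final averages also over the topology. Inserting the results of eqs.
(38) into eqs. (22), we obtain for the moments of the ground-state energy $E_0$ of a random problem:

$$\langle E_0 \rangle = \frac{1}{2}\, 4^N \binom{2N}{N}^{-1} - \frac{1}{2} \approx \frac{1}{2} \sqrt{\pi N},$$

$$\langle E_0^2 \rangle = \frac{5N+3}{6} - \frac{1}{2}\, 4^N \binom{2N}{N}^{-1}, \qquad (39)$$

$$\langle E_0^2 \rangle_c = \frac{10N+3}{12} - \frac{1}{4} \left[ 4^N \binom{2N}{N}^{-1} \right]^2 \approx \frac{(10 - 3\pi) N}{12},$$

where $\langle ab \rangle_c = \langle ab \rangle - \langle a \rangle \langle b \rangle$ is the connected moment; the approximate forms are valid for large N.

Similarly, from eqs. (15) we get for the moments of the excitation energy D, for a random pairing
in a random topology,

$$\langle D \rangle = \frac{N}{2} - \frac{1}{2}\, 4^N \binom{2N}{N}^{-1} + \frac{1}{2} \approx \frac{N}{2} - \frac{1}{2} \sqrt{\pi N}, \qquad (40)$$

$$\langle D^2 \rangle = \frac{N^2}{4} - \frac{N+1}{2}\, 4^N \binom{2N}{N}^{-1} + \frac{17N}{12} + \frac{1}{3} \approx \frac{N^2}{4} - \frac{N}{2} \sqrt{\pi N} + \frac{17N}{12}, \qquad (41)$$

$$\langle D^2 \rangle_c = \frac{11N+1}{12} - \frac{1}{4} \left[ 4^N \binom{2N}{N}^{-1} \right]^2 \approx \frac{(11 - 3\pi) N}{12}. \qquad (42)$$

Averaging the combined moment, eq. (24), over topology yields

$$\langle E_0 D \rangle = \frac{N+2}{4}\, 4^N \binom{2N}{N}^{-1} - \frac{13N+5}{12} \approx \frac{N}{4} \sqrt{\pi N} - \frac{13N}{12}, \qquad (43)$$

$$\langle E_0 D \rangle_c = -\frac{5N+1}{6} + \frac{1}{4} \left[ 4^N \binom{2N}{N}^{-1} \right]^2 \approx -\frac{(10 - 3\pi) N}{12}. \qquad (44)$$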

Combining this with the moments of $E_0$ and D, we get for the moments of the total energy,
$E = E_0 + D$, the following simple results

$$\langle E \rangle = \frac{N}{2}, \qquad (45)$$

$$\langle E^2 \rangle = \frac{N^2}{4} + \frac{N}{12}, \qquad (46)$$

$$\langle E^2 \rangle_c = \frac{N}{12}, \qquad (47)$$

which can be understood by noting that a random pairing in a random topology corresponds to a
set of N lines with random endpoints. Then each line has a uniform length distribution between
0 and 1, and E is their total length.
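The picture of N lines with random endpoints lends itself to a quick Monte Carlo check of eqs. (45) and (47). The sketch below is illustrative: the function names, the sample size and the seed are arbitrary choices. Since all times are drawn independently, mapping arr i on dep i is already a random pairing.

```python
import random
from statistics import fmean, pvariance

def sample_total_energy(n, rng):
    """Total waiting time of a random pairing in a random local problem."""
    arr = [rng.random() for _ in range(n)]
    dep = [rng.random() for _ in range(n)]
    # Each term is the length of one line, uniform on the unit period.
    return sum((d - a) % 1.0 for a, d in zip(arr, dep))

def estimate_moments(n, samples=20000, seed=0):
    """Monte Carlo estimates of the mean and variance of E."""
    rng = random.Random(seed)
    energies = [sample_total_energy(n, rng) for _ in range(samples)]
    return fmean(energies), pvariance(energies)

mean_e, var_e = estimate_moments(10)
```

For N = 10 the estimates should lie close to N/2 = 5 and N/12 ≈ 0.83, up to statistical fluctuations.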

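Computing the corresponding standard deviations, we have for a typical random problem (and a
random pairing, for D and E) at large N,

$$E_0 \approx \frac{1}{2} \sqrt{\pi N} \pm \sqrt{\left( \frac{5}{6} - \frac{\pi}{4} \right) N}, \qquad (48)$$

$$D \approx \frac{N}{2} - \frac{1}{2} \sqrt{\pi N} \pm \sqrt{\left( \frac{11}{12} - \frac{\pi}{4} \right) N}, \qquad (49)$$

$$E \approx \frac{N}{2} \pm \sqrt{\frac{N}{12}}, \qquad (50)$$

and we see that $E_0$ scales as $\sqrt{N}$, while D and E scale as N; the standard deviation scales as
$\sqrt{N}$ in each case.

Of interest are also the correlations at large N, given to order N by

$$\langle E_0 D \rangle_c \approx -\frac{N}{12}\,(10 - 3\pi),$$

$$\langle E_0 E \rangle_c \approx 0, \qquad (51)$$

$$\langle D E \rangle_c \approx \frac{N}{12},$$

indicating that E and $E_0$ become uncorrelated for a random pairing of a large random problem.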



4.5 Statistics for Individual Degeneracies 



An interesting but more difficult thing to compute is the average fraction $\gamma_n = \langle g_n \rangle / N!$ of
pairings having a given excitation energy D = n, in particular the ground-state fraction $\gamma_0$, which
gives the average probability of hitting a ground-state by chance. Since $g_n$ is simply related to $a_n$,
we will start by considering

$$\alpha_n = \frac{1}{2N} \binom{2N}{N} \langle a_n \rangle, \qquad (52)$$

in terms of which $\gamma_n$ can be expressed as

$$\gamma_n = \frac{N!}{(2N-1)!} \sum_{m=0}^{n} (-1)^m \binom{N+1}{m} \alpha_{n-m}. \qquad (53)$$

We then have

$$\alpha_n = \sum_{[P]} \frac{m[P]}{2N} \prod_k (k+n)^{P_k} = \sum_{[P]} \frac{\delta_{N, \sum_k P_k}}{P_1} \prod_{k \ge 1} (k+n)^{P_k} \binom{P_k + P_{k+1} - 1}{P_{k+1}}. \qquad (54)$$

A generating function for the N-dependence of $\alpha_n$ is then

$$A_n(z) = \sum_N z^N \alpha_n(N) = \sum_{[P]} \frac{1}{P_1} \prod_{k \ge 1} \big((k+n) z\big)^{P_k} \binom{P_k + P_{k+1} - 1}{P_{k+1}}. \qquad (55)$$

Again, starting the $P_k$ summation at a large $k_0$ and proceeding in order of decreasing k, gives
convergence of the full sum in the limit $k_0 \to \infty$. Denoting by $\omega_{k+n-1}^{P_k}$ the result of summing above
a certain k, we have by eq. (28),

$$\omega_{k-1} = \frac{k z}{1 - \omega_k}, \qquad (56)$$

and the final result is

$$A_n(z) = \sum_{P_1} \frac{1}{P_1}\, \omega_n^{P_1} = -\log\big(1 - \omega_n(z)\big). \qquad (57)$$

The recursion relation (56) can now be linearized by assuming $\omega_k = p_k / q_k$, which can be solved e.g.
by

$$q_{k-1} = q_k - p_k, \qquad (58)$$

$$p_{k-1} = k z\, q_k, \qquad (59)$$

which, upon eliminating $p_k$ gives

$$k z\, q_k - q_{k-1} + q_{k-2} = 0. \qquad (60)$$

By partial integration, it is simple to prove that the following sequence of integrals solves the
recursion relation (60) for $q_k$,

$$q_k = \mathrm{Im} \int_0^{\infty} \frac{(-iy)^k}{k!} \exp\left( -z \frac{y^2}{2} + iy \right) dy, \qquad k \ge 0, \; z > 0, \qquad (61)$$

which can be extended to negative k by recursion. The $q_k$ can be expanded in z as

$$q_k = \begin{cases} \sum_{m=0}^{\infty} \binom{2m+k}{k} (2m-1)!!\; z^m, & k \ge 0, \\ \sum_{m=0}^{[(-k-1)/2]} \binom{-k-1}{2m} (2m-1)!!\; z^m, & k < 0, \end{cases} \qquad (62)$$

where [] denotes integer part. Note that for non-negative k the series is an asymptotic one, while
for negative k it is finite. In particular, we have $q_{-1} = 1$. An independent solution to (60) is given
by

$$\tilde{q}_k(z) = \frac{1}{k!\, z^k}\, q_{-k-1}(-z), \qquad (63)$$

but this solution is irrelevant, having the wrong large-k behaviour.

In terms of $q_k$ we now have

$$1 - \omega_k = q_{k-1} / q_k, \qquad (64)$$

$$A_n = \log(q_n) - \log(q_{n-1}), \qquad (65)$$

which should then be expanded in powers of z to yield $\alpha_n(N)$ as the coefficient of $z^N$. In particular,
we have $A_0 = \log(q_0)$, where $q_0 = 1 + z + 3z^2 + 15z^3 + \ldots$.
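The expansion of $A_0 = \log(q_0)$ can be carried out with integer arithmetic. The sketch below is an illustration, with hypothetical function names; it assumes the series coefficients of $q_0$ are $(2m-1)!!$, as in the expansion quoted above, and uses the logarithmic-derivative identity $z\,q_0'(z) = q_0(z) \sum_N b_N z^N$ with $b_N = N\,\alpha_0(N)$. By eq. (52) with $a_0 = g_0$, $b_N$ equals $\frac{1}{2}\binom{2N}{N} \langle g_0 \rangle$.

```python
from fractions import Fraction
from math import comb, factorial

def double_factorial_odd(m):
    """(2m-1)!!, with the convention (-1)!! = 1 for m = 0."""
    result = 1
    for odd in range(1, 2 * m, 2):
        result *= odd
    return result

def scaled_ground_state_counts(n_max):
    """Return [b_1, ..., b_n_max], where b_N = N * [z^N] log q_0(z)."""
    a = [double_factorial_odd(m) for m in range(n_max + 1)]
    b = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        # Coefficient of z^n in z q0'(z) = q0(z) * sum_N b_N z^N.
        b[n] = n * a[n] - sum(b[j] * a[n - j] for j in range(1, n))
    return b[1:]

def ground_state_fraction(n, b_n):
    """gamma_0 = <g_0> / N!, with <g_0> = b_N / (binom(2N, N) / 2)."""
    return Fraction(2 * b_n, comb(2 * n, n) * factorial(n))

counts = scaled_ground_state_counts(6)
gamma0_n4 = float(ground_state_fraction(4, counts[3]))
```

The first few values are 1, 5, 37, 353, 4081 and 55205, and for N = 4 the ground-state fraction comes out as 353/840 ≈ 0.4202.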



In table 2 results are displayed for the average degeneracy of the two lowest states, based on an
expansion of $A_0$ and $A_1$ in powers of z. It is easy to check for small N, using table 1, that $g_m / \sum_k g_k$
for m = 0, 1, averaged over topologies with the proper probabilities, indeed agrees with $\gamma_m$ of table
2, obtained from the expansion of $A_k$.

N   K_N           K_N ⟨g_0⟩      K_N ⟨g_1⟩      ⟨g_0⟩         ⟨g_1⟩         γ_0       γ_1
--------------------------------------------------------------------------------------------
1   1             1              0              1.00000       0.00000       1.000000  0.000000
2   3             5              1              1.66667       0.33333       0.833333  0.166667
3   10            37             22             3.70000       2.20000       0.616667  0.366667
4   35            353            411            10.0857       11.7429       0.420238  0.489286
5   126           4081           7676           32.3889       60.9206       0.269907  0.507672
6   462           55205          149741         119.491       324.115       0.165960  0.450159
7   1716          854197         3.09875x10^6   497.784       1805.80       0.098767  0.358294
8   6435          1.4876x10^7    6.84187x10^7   2311.74       10632.3       0.057335  0.263697
9   24310         2.88019x10^8   1.61447x10^9   11847.7       66411.7       0.032649  0.183013
10  92378         6.13891x10^9   4.07031x10^10  66454.3       440614.       0.018313  0.121421
11  352716        1.42882x10^11  1.09496x10^12  405092.       3.10436x10^6  0.010148  0.077771
12  1.35208x10^6  3.60668x10^12  3.13708x10^13  2.66751x10^6  2.32019x10^7  0.005569  0.048438
13  5.20030x10^6  9.81584x10^13  9.55147x10^14  1.88755x10^7  1.83672x10^8  0.003031  0.029496
14  2.00583x10^7  2.86562x10^15  3.08337x10^16  1.42865x10^8  1.5372x10^9   0.001639  0.017633

Table 2: Results for the average degeneracy of the lowest energy-states for various system sizes N.
The second column gives an integer normalization factor $K_N = \frac{1}{2}\binom{2N}{N}$. Dividing the integers in the
next two columns by $K_N$ yields the average number of ground-states $\langle g_0 \rangle$ and first excited states $\langle g_1 \rangle$,
respectively. Dividing these by $N!$ yields $\gamma_0$ and $\gamma_1$.

The result for $\gamma_0$ strongly indicates an asymptotic behaviour of $\gamma_0 \sim 2N\,2^{-N}$. This corresponds to
an exponential decrease with N of the average probability for a random pairing to hit a ground-state.
However, the average number of ground-states grows faster than exponentially:
$\langle g_0 \rangle \sim 2N\, N!\, 2^{-N}$.

This abundance of ground-states indicates that a corresponding restricted problem might well have
a solution with a locally minimal waiting-time, if the restrictions are not too severe; this is used in
ref. [1] to simplify the solution of a set of restricted crew scheduling problems.

By reducing the state-space of a restricted problem to the set of ground-states of the corresponding
unrestricted problem, the average information gained at each airport is given by $-\log \gamma_0$, which for a
large airport roughly yields $N \log 2$. Summing this over several airports yields a total information
gain scaling as $N_f \log 2$, with $N_f$ the total number of flights. Partly, this is due to a grouping of
flights, which contributes an average information gain of $N_a \log 2$, with $N_a$ the number of airports.



5 Conclusions 

We have performed a statistical analysis of an ensemble of random unrestricted local crew scheduling 
problems, formulated in terms of mapping arrivals onto departures at a single airport so as to 
minimize waiting-time. 

For the ground-state energy $E_0$ of a large random problem, we find that both the average and
the fluctuations scale like $\sqrt{N}$. For a random pairing of such a problem, on the other hand, the
excitation D and the total energy $E = E_0 + D$ both grow linearly with N, with fluctuations scaling
like $\sqrt{N}$.

The individual degeneracies of the lowest energy states for random problems are such that the
average probability for hitting an optimal pairing by chance decreases like $2N\,2^{-N}$ for large N.
Since the total number of pairings grows like $N!$, the average number of ground-states grows very
fast with system size.

The results and the methods of analysis described in this paper are useful when designing efficient
algorithms for the crew scheduling problems with global restrictions, by providing means for
estimating the difficulty of a given problem, and for understanding and simplifying its structure.

The optimal crew waiting-time for a restricted problem is bounded from below by the ground-state
energy of the corresponding unrestricted problem, which is useful for gauging algorithmic
performance for problems of realistic size. Due to a faster than exponential growth of the number
of ground-states with problem size, this bound is often saturated. This can be used to simplify a
restricted problem: By insisting on the local ground-state energy, airports can be split into several
parts. For a large random problem, this results on the average in a reduction of state-space size by
a factor of two for each airport.

Some of the calculations in this paper are based on novel methods of a general nature, that may 
have applications also in other contexts with a similar topological structure. 